perf(bin): add criterion benchmarks #1758
Conversation
Better performance (regression) testing is definitely something we are interested in having. One part of this is building up a suite of benchmarks that we run in CI. I envision a process where, if we identify a performance issue, we first quantify it through some (micro)benchmarks that from then on become part of the suite. If, based on the benchmark results, we decide that a performance issue needs to be fixed, we do that. A second part is macro-level benchmarks, e.g., end-to-end throughput and other tests. I haven't spent much time on how we should best do this. At the moment, I feel that some simple end-to-end tests, like you propose here or we have in …
Making some progress. Actual numbers are still preliminary.
Force-pushed from 29bed4c to b532192.
Wraps the `neqo-client` and `neqo-server` code, starts the server and runs various benchmarks through the client.

Benchmarks:
- single-request-1gb
- single-request-1mb
- requests-per-second
- handshakes-per-second
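For illustration only, a criterion harness along these lines might look like the minimal sketch below; the helper name `run_client` and the localhost URL are placeholders, not the actual neqo-bin APIs:

```rust
// Minimal criterion sketch (not the PR's code): drive a single 1 MiB request
// against an already-running local server and report throughput.
// `run_client` and the URL are placeholders for whatever the wrapped
// neqo-client code exposes.
use criterion::{criterion_group, criterion_main, Criterion, Throughput};

fn single_request_1mb(c: &mut Criterion) {
    let mut group = c.benchmark_group("single-request");
    group.throughput(Throughput::Bytes(1024 * 1024));
    group.bench_function("1mb", |b| {
        b.iter(|| {
            // Placeholder: issue one 1 MiB request through the wrapped client,
            // e.g. run_client(&["https://localhost:4433/1048576"]);
        });
    });
    group.finish();
}

criterion_group!(benches, single_request_1mb);
criterion_main!(benches);
```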
Benchmark results: Performance differences relative to efc4813.
Client/server transfer results: Transfer of 134217728 bytes (128 MiB) over loopback.
It just takes too long on the bench machine.
This pull request is ready for a first review.
Is there a way to group the benchmarks, so we can pin those where it makes sense to a core, and let others such as this one be scheduled on multiple cores?
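As a rough illustration (not part of this PR), criterion groups could be used to separate pinned from unpinned benchmarks, with pinning done via a crate such as `core_affinity`; the group names and pinning policy below are assumptions:

```rust
// Sketch only: two criterion groups, one pinning its thread to a single core
// for stable timings, the other left to the OS scheduler. Group names and the
// use of the `core_affinity` crate are assumptions, not part of this PR.
use criterion::{criterion_group, criterion_main, Criterion};

fn pinned_benches(c: &mut Criterion) {
    // Pin the benchmark thread to the first available core.
    if let Some(core) = core_affinity::get_core_ids().and_then(|ids| ids.first().cloned()) {
        core_affinity::set_for_current(core);
    }
    c.bench_function("single-request-1mb-pinned", |b| b.iter(|| { /* one client request */ }));
}

fn unpinned_benches(c: &mut Criterion) {
    // Not pinned, so client and server may be scheduled across multiple cores.
    c.bench_function("requests-per-second", |b| b.iter(|| { /* many requests */ }));
}

criterion_group!(pinned, pinned_benches);
criterion_group!(unpinned, unpinned_benches);
criterion_main!(pinned, unpinned);
```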
This reverts commit a0ef46a.
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #1758      +/-   ##
==========================================
- Coverage   93.07%   93.05%   -0.02%
==========================================
  Files         117      117
  Lines       36449    36374      -75
==========================================
- Hits        33926    33849      -77
- Misses       2523     2525       +2

☔ View full report in Codecov by Sentry.
@larseggert mind taking another look?
Could you break out the renaming of `ClientError` and `ServerError` to `Error` into its own PR? Are there any other chores that should not be part of this one?
This pull request moves … Long story short, these renames are not unrelated. Do you still want me to split them out into a separate pull request, @larseggert? If so, shall I include the file moving in that separate pull request as well?
In general, I am very much in favor of atomic, tightly scoped pull requests. That said, in my eyes all changes here are related to the pull request.
The latest benchmark run results are now in the same ballpark as when run on my laptop.
Wraps the `neqo-client` and `neqo-server` code, starts the server and runs various benchmarks through the client.

Benchmarks:
- single-request-1gb
- single-request-1mb
- requests-per-second
- handshakes-per-second

It would be great to have end-to-end benchmarks (in CI) to land performance optimizations like #1741 with confidence. This pull request is a shot at the above using criterion.

Benefits:
- `bench.yml` workflow, including its baseline comparison

Obviously there are a million other ways to do this, e.g. `upload_test.sh`, or simulating an actual network (latency, packet drop, ...), i.e. not just loopback.

What are people's thoughts? Are there long-term plans for neqo? Are there prominent examples worth taking inspiration from? Does Firefox / Mozilla already have some continuous benchmarking setup worth hooking into?
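For reference, criterion's own baseline mechanism can be exercised locally with `cargo bench -- --save-baseline main` on the base branch and `cargo bench -- --baseline main` on the feature branch; whether `bench.yml` relies on exactly these flags is an assumption here, not something stated in this thread.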